A Hybrid Disambiguation Measure for Inaccurate Cultural Heritage Data

Authors

  • Julia Efremova
  • Bijan Ranjbar Sahraei
  • Toon Calders
Abstract

Cultural heritage data is always associated with inaccurate information and different types of ambiguity. For instance, the names of persons, occupations and places mentioned in historical documents are not standardized and contain numerous variations. This article examines various existing similarity functions in detail and proposes a hybrid technique for the following task: given a list of candidate names, occupations and places extracted from historical documents, identify those that are variations of the same person name, occupation or place, respectively. The performance of the method is evaluated on three manually constructed datasets and one public dataset in terms of precision, recall and F-measure. The results demonstrate that the hybrid technique outperforms current methods and makes it possible to improve the quality of cultural heritage data significantly.
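
The abstract does not spell out which similarity functions are combined or how. As a rough illustration only, the sketch below shows one way a hybrid measure could be assembled in Python: a normalized edit-distance similarity and a character-bigram Dice coefficient are averaged with weights, and pairs scoring above a threshold are treated as variants of the same entity. The function names, the equal weights and the 0.7 threshold are assumptions made for this example, not the authors' published method. A small helper also shows how precision, recall and F-measure are computed over predicted variant pairs.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Edit distance rescaled to [0, 1], where 1 means identical."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def bigram_dice(a: str, b: str) -> float:
    """Dice coefficient over the sets of character bigrams."""
    grams_a = {a[i:i + 2] for i in range(len(a) - 1)}
    grams_b = {b[i:i + 2] for i in range(len(b) - 1)}
    if not grams_a and not grams_b:
        return 1.0 if a == b else 0.0
    return 2 * len(grams_a & grams_b) / (len(grams_a) + len(grams_b))


def hybrid_similarity(a: str, b: str, w_edit: float = 0.5) -> float:
    """Weighted mix of two measures; the 0.5 weight is an assumed default."""
    a, b = a.strip().lower(), b.strip().lower()
    return w_edit * edit_similarity(a, b) + (1 - w_edit) * bigram_dice(a, b)


def is_variant(a: str, b: str, threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: same entity if the score passes a threshold."""
    return hybrid_similarity(a, b) >= threshold


def precision_recall_f(predicted: set, actual: set) -> tuple:
    """Precision, recall and F-measure over sets of predicted/true pairs."""
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f = 2 * precision * recall / total if total else 0.0
    return precision, recall, f


if __name__ == "__main__":
    print(is_variant("Jansen", "Janssen"))
    print(is_variant("Jansen", "Pietersen"))
```

In practice the weights and the threshold would be tuned on labelled data, and further measures (for example a phonetic encoding) could be added to the mix in the same way.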

Similar resources

A Data Model for Multimedia Database of Malaysian Cultural Heritage Artefacts Management

The vast amount of multimedia, especially in the cultural heritage domain, needs a special data model that caters to the needs of both collection management and digital multimedia content. Multimedia databases for cultural heritage applications are currently an active research area; however, limited research has been done in the local context. The intention of this research is to design a data model for mu...

Exploring entity recognition and disambiguation for cultural heritage collections

Unstructured metadata fields such as ‘description’ offer tremendous value for users to understand cultural heritage objects. However, this type of narrative information is of little direct use within a machine-readable context due to its unstructured nature. This paper explores the possibilities and limitations of Named-Entity Recognition (NER) and Term Extraction (TE) to mine such unstructured...

A Study and Explanation of the Evolution of Cultural Heritage Conservation Policies and Programs, 1357 to 1391 (Solar Hijri calendar)

Various factors influence the making of policies and plans for the conservation of cultural heritage in Iran. This paper aims to study and review the structural changes of the Cultural Heritage Organization of Iran that resulted from important cultural, social, political, and economic changes and events in Iran and that have a significant role in the process of making policies and plans for conservation...

Ontology-based Disambiguation of Spatiotemporal Locations

Geographic place names are often semantically highly ambiguous. For example, there are 491 places in Finland sharing the same name “Isosaari” (great island) that are instances of several geographical classes, such as Island, Forest, Peninsula, Inhabited area, etc. Referring unambiguously to a particular “Isosaari”, either when annotating content or during information retrieval, can be quite p...

A software system to work with 3D models in cultural heritage research

The availability of intuitive, user-friendly, and specialized software to work with 3D models of cultural heritage artefacts is as important as the availability of low-cost and robust data acquisition techniques for the adoption of digitized 3D models in cultural heritage research. The number of available high-quality digitized artefacts increases rapidly with the advent of low-cost 3D scanning...

Publication date: 2014